Language modeling for sentence retrieval: A comparison between Multiple-Bernoulli models and Multinomial models
Abstract
In this work we use a sentence retrieval task to compare Language Modeling based on a multi-variate Bernoulli distribution with Language Modeling based on the popular multinomial models. Viewing text generation as a multiple-Bernoulli process is not currently the predominant approach in Language Modeling for Information Retrieval, but we show that the characteristics of the task make it well suited to statistical Language Modeling based on a multi-variate Bernoulli distribution.
Similar resources
ZIP and data document visualization
Text data modeling has been usually considered with Bernoulli or multinomial event models. Poisson distribution is considered inefficient for text information retrieval. In this work, we propose to incorporate the Zero Inflated Poisson model in the Generative Topographic Mapping algorithm. The modified algorithm is presented as a text document cluster extraction and visualization tool. Experime...
A Comparison of Event Models for Naive Bayes Text Classification
Recent approaches to text classification have used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-g...
A Comparison of Event Models for Naive Bayes Text Classification
Recent approaches to text classification have used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-gram language model with i...
Relevance Feedback Models for Recommendation
We extended language modeling approaches in information retrieval (IR) to combine collaborative filtering (CF) and content-based filtering (CBF). Our approach is based on the analogy between IR and CF, especially between CF and relevance feedback (RF). Both CF and RF exploit users’ preference/relevance judgments to recommend items. We first introduce a multinomial model that combines CF and CBF...
An Efficient Computation of the Multiple-Bernoulli Language Model
The Multiple Bernoulli (MB) Language Model has generally been considered too computationally expensive for practical purposes and has been superseded by the more efficient multinomial approach. While the model has many attractive properties, little is actually known about the retrieval effectiveness of the MB model due to its high cost of execution. In this paper, we show how an efficient implementatio...
Publication date: 2005